Abstract

We introduce constraints necessary for type checking a higher-order concurrent
constraint language, and solve them with an incremental algorithm. Our constraint
system extends rational unification by constraints $x \sqsubseteq y$ saying that "$x$ has at
least the structure of $y$", modelled by a weak instance relation between trees. This notion
of instance has been carefully chosen to be weaker than the usual one, which renders
semi-unification undecidable. Semi-unification has more than once served to link
unification problems arising from type inference and those considered in computational
linguistics. Just as polymorphic recursion corresponds to subsumption through the
semi-unification problem, our type constraint problem corresponds to weak subsumption
of feature graphs in linguistics. The decidability problem for weak subsumption
for feature graphs has been settled by Dörre [Dör94]. In contrast to Dörre's, our
algorithm is fully incremental and does not refer to finite state automata. Our algorithm
is also more flexible: it allows a number of extensions (records, sorts, disjunctive
types, type declarations, and others) which make it suitable for type inference of a
full-fledged programming language.

Keywords: type inference, weak subsumption, unification, constraints, constraint 
programming 
1 Introduction



We give an algorithm which is at the heart of a type diagnosis system for a higher-order
concurrent constraint language, viz. the $\gamma$ calculus [Smo94] which is the underlying
operational model of the programming language Oz [ST94]. The algorithm decides satisfiability
of constraints containing equations $x = y$ and $x = f(\bar{y})$, and weak subsumption constraints
$x \sqsubseteq y$ over infinite constructor trees with free variables. The algorithm is given fully in
terms of constraint simplification. On the one hand, this gives credit to the close
relationship between type inference and constraint solving (e.g., [Wan87, AW93, KPS94] and
many others). On the other hand it establishes yet another correspondence between
unification problems arising from polymorphic type inference and unification based grammar
formalisms: The most prominent one is the equivalence of type checking polymorphic
recursion [Myc84, Hen88] with semi-unification [KTU93, DR90], both of which are undecidable in
general. To avoid this undecidability, we chose a weaker instance relation to give semantics
to $x \sqsubseteq y$. For example, we allow $f(a\ b)$ as an instance of $f(x\ x)$ even if $a \neq b$. On the type
side, this type of constraints maintains some of the polymorphic flavour, but abandons full
parametric polymorphism [MN95].

We start out from the set of infinite constructor trees with holes (free variables). We give 
a semantics which interprets the tree assigned to a variable dually: As itself and the set 
of its "weak" instances. Our algorithm terminates, and can be shown to be correct and 
complete under this semantics. The decidability problem for our constraints turned out to 
be equivalent to weak subsumption over feature graphs solved by Dörre [Dör94] for feature
graphs with feature (but no arity) constraints. 

However, only half of Dörre's two-step solution is a constraint solving algorithm. The
second step relies on the equivalence of non-deterministic and deterministic finite state
automata. In contrast, our algorithm decides satisfiability in a completely incremental
manner and is thus amenable to integration into a concurrent constraint language like
Oz [ST94] or AKL [JH91].


The extension of our algorithm towards feature trees is easily possible (see [MJN95]). This
allows type diagnosis for records [ST92] and objects. An entirely set-based semantics
allows the algorithm to be extended naturally to a full-fledged type diagnosis system,
covering (among other aspects) sorts, disjunctive types, and recursive data type
declarations [NPT93].
Type Diagnosis. As an illustrating example for the form of type diagnosis we have in
mind, consider the following $\gamma$ program:

\[ \exists x\, \exists y\, \exists z\, \exists p\;\; p{:}\,u\,v \,/\, v = cons(x\ u) \;\wedge\; p\,y\,y \;\wedge\; x = f(y\ z) \]

This program declares four variables $x$, $y$, $z$, and $p$. It defines a relational abstraction $p$,
which states that its two arguments $u$ and $v$ are related through the equation $v = cons(x\ u)$.¹
Furthermore, it states the equality $x = f(y\ z)$ and applies $p$ to $y\,y$. This application $p\,y\,y$
reduces to a copy of the abstraction $p$ with the actual arguments $y\,y$ replaced for the formal
ones $u\,v$:

\[ \exists x\, \exists y\, \exists z\, \exists p\;\; p{:}\,u\,v \,/\, v = cons(x\ u) \;\wedge\; p\,y\,y \;\wedge\; x = f(y\ z) \]
\[ \longrightarrow\; \exists x\, \exists y\, \exists z\, \exists p\;\; p{:}\,u\,v \,/\, v = cons(x\ u) \;\wedge\; y = cons(x\ y) \;\wedge\; x = f(y\ z) \]

Observe how the abstraction $p$ is defined by reference to the global variable $x$, while the
value of $x$ is defined through an application of $p$: $p\,y\,y \wedge x = f(y\ z)$. Such a cycle is specific to
the $\gamma$ calculus since no other language offers explicit declaration of logic variables global to
an abstraction (be it logic, functional, or concurrent languages, e.g., Prolog, ML [HMM86],
or Pict [PT95]).


The types of the variables involved are described by the following constraint.² For ease of
reading, we slightly abuse notation and pick the type variables identical to the
corresponding object variables:

\[ p = (u\ v) \;\wedge\; v = cons(x\ u) \;\wedge\; y \sqsubseteq u \;\wedge\; y \sqsubseteq v \;\wedge\; x = f(y\ z) \]

$(u\ v)$ is the relational type of $p$, and the application gives rise to the constraint
$y \sqsubseteq u \wedge y \sqsubseteq v$, which says that $y$ is constrained by both formal arguments of the procedure $p$. The
subconstraint $x = f(y\ z) \wedge y \sqsubseteq v \wedge v = cons(x\ u)$ reflects the cyclic dependency between $x$
and $p$. It says that $y$ must be in the set of instances of $v$, which depends on $x$ through
$v = cons(x\ u)$, and at the same time that $x$ should be exactly $f(y\ z)$.

Type diagnosis along this line is discussed in depth in [MN95].



Related Work. Apart from the already mentioned work, related work includes
investigations about membership constraints (e.g., [NPT93]), type analysis for untyped languages
(Soft Typing) [AW93, CF91, WC93], constraint-based program analysis [KPS94] and the
derivation of recursive sets from programs [FSVY91]. For proofs and a detailed discussion
of related work see [MN95].

1 Note that $p{:}\,u\,v \,/\, v = cons(x\ u)$ is different from a named $\lambda$ abstraction $p = \lambda u.\, cons(x\ u)$ because it is
relational rather than functional, and also different from the Prolog program p(u, v) :- v = cons(x u)., because
Prolog does not allow variables to be global with respect to a predicate but rather existentially quantifies x.

2 The formal account of the derivation of type constraints from programs will be given in [Mül96].



Plan of the Paper. This paper is structured as follows. In Section 2 below we
present our constraints along with their semantics and give necessary notation. Section 3
gives a simple algorithm which is correct but non-terminating. Section 4 gives the rules of
the full algorithm. Section 5 concludes and gives a brief outlook.



2 Constraints and Semantics 

We assume a signature $\Sigma$ of function symbols with at least two elements ranged over by
$f, g, h, a, b, c$ and an infinite set of base variables $BV$ ranged over by $\chi$. If $V$ is a further
set of variables then $\mathit{IT}(V)$ stands for the set of all finite or infinite trees over signature $\Sigma$
and variables $V$. Trees of $\mathit{IT}(V)$ are always ranged over by $s$ and $t$. The set of variables
occurring in a tree $t$ is denoted by $V(t)$. Sequences of variables are written as $\bar{x}$, or $\bar{\chi}$.

We build constraints over a set of constraint variables ranged over by $x, y, z, u, v, w$.
Constraint variables must contain at least base variables. The syntax of our constraints $\varphi$,
$\psi$ is as follows:

\[ x, y ::= \chi \qquad \text{and} \qquad \varphi, \psi ::= x = y \;\mid\; x = f(\bar{y}) \;\mid\; x \sqsubseteq y \;\mid\; \varphi \wedge \psi \]

As atomic constraints we consider equations $x = y$ or $x = f(\bar{y})$ and weak subsumption
constraints $x \sqsubseteq y$. Constraints are atomic constraints closed under conjunction. First-order
formulae built over constraints $\varphi$ are denoted by $\Phi$. We define $\equiv$ to be the least binary
relation on constraints such that $\wedge$ is associative and commutative. For convenience, we shall use
the following notation:

\[ \varphi \text{ in } \psi \quad\text{iff}\quad \text{there exists } \varphi' \text{ with } \varphi \wedge \varphi' \equiv \psi \]

As semantic structures we pick tree-structures which we also call $\mathit{IT}(V)$ for some set $V$.
The domain of a tree-structure $\mathit{IT}(V)$ is the set of trees $\mathit{IT}(V)$. Its interpretation is
defined by $f^{\mathit{IT}(V)}(\bar{t}) = f(\bar{t})$. We define the application $f(\bar{T})$ of $f$ to a sequence of sets of
trees $\bar{T}$ elementwise, $f(\bar{T}) = \{ f(\bar{t}) \mid \bar{t} \in \bar{T} \}$. Given a tree $s \in \mathit{IT}(V)$, the set $\mathrm{Inst}_V(s)$ of
weak instances of $s$ is defined as the greatest fixed point of:

\[ \mathrm{Inst}_V(s) = \begin{cases} \mathit{IT}(V) & \text{if } s = x \text{ for some } x \\ f(\mathrm{Inst}_V(\bar{t})) & \text{if } s = f(\bar{t}) \text{ for some } \bar{t} \end{cases} \]
Notice that this definition implies $f(a\ b) \in \mathrm{Inst}_V(f(x\ x))$, even if $a \neq b$. Let $V_1$,
$V_2$ be two sets whose elements we call variables. A $V_1$-$V_2$-substitution $\sigma$ is a mapping
from $V_1$ to $\mathit{IT}(V_2)$. By homomorphic extension, every substitution can be extended
to a mapping from $\mathit{IT}(V_1)$ to $\mathit{IT}(V_2)$. The set of strong instances of $s$ is defined by
$\mathrm{Inst}'_V(s) = \{ \sigma(s) \mid \sigma \text{ is a } V(s)\text{-}V\text{-substitution} \}$. Note that $\mathrm{Inst}'_V(s) \subseteq \mathrm{Inst}_V(s)$, and that
$f(a\ b) \notin \mathrm{Inst}'_V(f(x\ x))$ if $a \neq b$. Using $\mathrm{Inst}'_V(s)$ instead of $\mathrm{Inst}_V(s)$ would make satisfiability
of our constraints equivalent to semi-unification and undecidable [KTU90, DR90].
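
The difference between weak and strong instances can be made concrete on finite trees. The following Python sketch is an illustration only and not part of the algorithm of this paper: it assumes a hypothetical encoding in which a variable is a string and a constructor tree is a tuple `(name, child_1, ..., child_n)`, and it handles finite trees only, whereas the definition above is a greatest fixed point over possibly infinite trees. The function names `weak_instance` and `strong_instance` are introduced here for illustration.

```python
# Finite trees only: a variable is a str, a constructor tree is a tuple
# (name, child_1, ..., child_n); a constant such as a is ("a",).

def is_var(t):
    return isinstance(t, str)

def weak_instance(t, s):
    """t is a weak instance of s: every variable occurrence in s may be
    replaced independently by an arbitrary tree."""
    if is_var(s):
        return True
    if is_var(t):
        return False
    return (t[0] == s[0] and len(t) == len(s)
            and all(weak_instance(ti, si) for ti, si in zip(t[1:], s[1:])))

def strong_instance(t, s, binding=None):
    """t = sigma(s) for a single substitution sigma: repeated variables
    of s must be replaced by the same tree."""
    if binding is None:
        binding = {}
    if is_var(s):
        return binding.setdefault(s, t) == t
    if is_var(t):
        return False
    return (t[0] == s[0] and len(t) == len(s)
            and all(strong_instance(ti, si, binding)
                    for ti, si in zip(t[1:], s[1:])))

a, b = ("a",), ("b",)
pattern = ("f", "x", "x")
print(weak_instance(("f", a, b), pattern))    # True
print(strong_instance(("f", a, b), pattern))  # False
```

The only difference between the two functions is the shared `binding` dictionary: dropping it is exactly what lets $f(a\ b)$ count as an instance of $f(x\ x)$.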



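Let $\sigma$ be a $V_1$-$V_2$-substitution, $\{x, y, \bar{z}\} \subseteq V_1$, and $\varphi, \psi$ constraints such that $V(\varphi) \subseteq V_1$,
$V(\psi) \subseteq V_1$. Then we define:

\[ \models_\sigma x = y \;\text{ iff }\; \sigma(x) = \sigma(y) \qquad\qquad \models_\sigma x = f(\bar{y}) \;\text{ iff }\; \sigma(x) = f^{\mathit{IT}(V_2)}(\sigma(\bar{y})) \]
\[ \models_\sigma x \sqsubseteq y \;\text{ iff }\; \mathrm{Inst}_{V_2}(\sigma(x)) \subseteq \mathrm{Inst}_{V_2}(\sigma(y)) \qquad\qquad \models_\sigma \varphi \wedge \psi \;\text{ iff }\; \models_\sigma \varphi \text{ and } \models_\sigma \psi \]

A $V_1$-$V_2$-solution of $\varphi$ is a $V_1$-$V_2$-substitution satisfying $\models_\sigma \varphi$. A constraint $\varphi$ is called
satisfiable, if there exists a $V_1$-$V_2$-solution for $\varphi$. The notion of $\models_\sigma$ extends to arbitrary
first-order formulae $\Phi$ in the usual way. We say that a formula $\Phi$ is valid, if $\models_\sigma \Phi$ holds
for all $V_1$-$V_2$-substitutions $\sigma$ with $V(\Phi) \subseteq V_1$. In symbols, $\models \Phi$.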

Our setting is a conservative extension of the usual rational unification problem. This
means that free variables in the semantic domain do not affect equality constraints. A
constraint $\varphi$ is satisfiable in the tree-model $\mathit{IT}(V)$, if there exists a $BV$-$V$-solution of $\varphi$.
The trees of $\mathit{IT}(\emptyset)$ are called ground trees.

Proposition 2.1 Suppose $\varphi$ not to contain weak subsumption constraints. Then $\varphi$ is
satisfiable if and only if it is satisfiable in the model of ground trees.

The statement would be wrong for $\varphi$'s containing weak subsumption constraints. For
instance, consider the following $\varphi$ with $a \neq b$:

\[ \varphi \;=\; x \sqsubseteq z \;\wedge\; y \sqsubseteq z \;\wedge\; x = a \;\wedge\; y = b \]

This $\varphi$ is not satisfiable in the model of ground trees, since the set $\mathrm{Inst}_\emptyset(t)$ is a singleton for
all ground trees $t$, whereas any $V_1$-$V_2$-solution $\sigma$ of $\varphi$ has to satisfy $\{a, b\} \subseteq \mathrm{Inst}_{V_2}(\sigma(z))$.
However, there exists a $\{x, y, z\}$-$\{v\}$-solution $\sigma$ of $\varphi$, where $\{v\}$ is a singleton: $\sigma(x) = a$,
$\sigma(y) = b$, $\sigma(z) = v$.

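Proposition 2.2 For all $x, y, z, u, v$ the following statements hold:

1) $\models x = y \rightarrow x \sqsubseteq y$,   2) $\models x \sqsubseteq y \wedge y \sqsubseteq z \rightarrow x \sqsubseteq z$,   3) $\models x = f(\bar{y}) \rightarrow x \sqsubseteq f(\bar{y})$,

4) $\not\models x \sqsubseteq y \wedge y \sqsubseteq x \rightarrow x = y$,   5) $\not\models x = f(u\ v) \wedge x \sqsubseteq y \wedge y = f(z\ z) \rightarrow u = v$.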
Weak Subsumption vs. Sets of Weak Instances. In the remainder of this section we
compare our sets of weak instances with Dörre's notion of weak subsumption. Let us
consider constructor trees as special feature trees with integer-valued features, a distinguished
feature label (e.g., [NP93, Pac94]), and a distinguished feature arity. Given feature
constraints $x[f]y$ saying that $x$ has direct subtree $y$ at feature $f$, the equation $x = f(y_1 \ldots y_n)$
can be considered equivalent to:³

\[ x[\mathit{arity}]n \;\wedge\; x[\mathit{label}]f \;\wedge\; x[1]y_1 \;\wedge\; \ldots \;\wedge\; x[n]y_n \]

Let us write $s[f]\!\downarrow$ to say that the tree $s$ has some direct subtree at $f$. A simulation between
$\mathit{IT}(V_1)$ and $\mathit{IT}(V_2)$ is a relation $\Delta \subseteq \mathit{IT}(V_1) \times \mathit{IT}(V_2)$ satisfying: If $(t, s) \in \Delta$ then

(Arity Simulation) If $t[\mathit{label}]\!\downarrow$ and there is an $n$ such that $t[\mathit{arity}]n$, then
$s[\mathit{arity}]n$.

(Feature Simulation) If $t[f]\!\downarrow$ and there is a tree $t'$ such that $t[f]t'$, then $s[f]\!\downarrow$,
$s[f]s'$, and $(t', s') \in \Delta$.



Now, the weak subsumption preorder $\sqsubseteq_V$ is defined by:

\[ t \sqsubseteq_V s \quad\text{iff}\quad \text{there is a simulation } \Delta \subseteq \mathit{IT}(V) \times \mathit{IT}(V) \text{ such that } (s, t) \in \Delta \]

We have the following lemma:

Lemma 2.3 For all constructor trees $s, t$ it holds that: $\mathrm{Inst}_V(s) \subseteq \mathrm{Inst}_V(t)$ iff $s \sqsubseteq_V t$.

A similar statement can be derived for the set of strong instances and a strong
subsumption preorder following [Dör94]. The difference between $\sqsubseteq_V$ and Dörre's notion of weak
subsumption is that he does not require (Arity Simulation), while we naturally do since we
start from constructor trees. For type checking, constructor trees seem more natural: For
illustration note that the arity of a procedure is essential type information.
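
The inclusion of weak instance sets can be checked coinductively on rational trees, in the spirit of the simulation above. The following Python sketch is an illustration under stated assumptions and not the algorithm of this paper: a rational tree is given as a node of a hypothetical graph `g` (a dictionary mapping a node name either to `None` for a variable or to a pair of constructor name and list of child nodes), and the function name `weakly_subsumed` is introduced here. Pairs already under consideration are assumed to hold, which mirrors taking a greatest fixed point.

```python
def weakly_subsumed(g, s, t, assumed=None):
    """Check Inst(s) <= Inst(t) for nodes s, t of the graph g.
    g[n] is None for a variable, else (constructor, [child nodes])."""
    if assumed is None:
        assumed = set()
    if (s, t) in assumed:
        return True          # coinductive hypothesis: pair already visited
    if g[t] is None:
        return True          # a variable on the right admits every tree
    if g[s] is None:
        return False         # a variable on the left is too unconstrained
    (f, s_args), (h, t_args) = g[s], g[t]
    if f != h or len(s_args) != len(t_args):
        return False         # label or arity mismatch
    assumed.add((s, t))
    return all(weakly_subsumed(g, si, ti, assumed)
               for si, ti in zip(s_args, t_args))

g = {
    "v": None,               # a variable
    "a": ("a", []),          # the constant a
    "x": ("f", ["x"]),       # the infinite tree x = f(x)
    "y": ("f", ["v"]),       # y = f(v)
    "z": ("f", ["a"]),       # z = f(a)
}
print(weakly_subsumed(g, "x", "y"))  # True
print(weakly_subsumed(g, "y", "x"))  # False
```

The check compares both the constructor name and the number of children, which corresponds to requiring (Arity Simulation) in addition to (Feature Simulation).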



3 A Non-terminating Solution 

In order to solve our constraints one could come up with the system given in Figure 1.
Besides the three usual unification rules for rational trees, the only additional rule is
(Descend). This algorithm is correct and very likely to be complete in that for an unsatisfiable

3 This simpler encoding of constructor trees not using arity constraints has been suggested by one of
the referees.
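(Decom)    $\dfrac{x = f(\bar{u}) \wedge \varphi}{\bar{u} = \bar{v} \wedge \varphi}$    if $x = f(\bar{v})$ in $\varphi$.

(Clash)    $\dfrac{x = f(\bar{u}) \wedge \varphi}{\bot}$    if $x = g(\bar{v})$ in $\varphi$, and $f \neq g$.

(Elim)     $\dfrac{x = y \wedge \varphi}{x = y \wedge \varphi[y/x]}$    if $x \in V(\varphi)$, and $x \neq y$.

(Descend)  $\dfrac{x \sqsubseteq y \wedge \varphi}{x = f(\bar{u}) \wedge \bar{u} \sqsubseteq \bar{z} \wedge \varphi}$    if $\bar{u}$ fresh, $y = f(\bar{z})$ in $\varphi$.

Figure 1: A Non-Terminating Algorithm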



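constraint $\varphi$ there is a derivation from $\varphi$ to $\bot$. However, this intuitive algorithm loops due
to the introduction of new variables:

\[ x \sqsubseteq y \wedge y = f(x) \]
\[ \xrightarrow{\text{Descend}}\; x = f(x_1) \wedge x_1 \sqsubseteq x \wedge y = f(x) \]
\[ \xrightarrow{\text{Descend}}\; x = f(x_1) \wedge x_1 = f(x_2) \wedge x_2 \sqsubseteq x_1 \wedge y = f(x) \]
\[ \xrightarrow{\text{Descend}}\; \ldots \]

Note that some form of descending is necessary in order to derive the clash from the
inconsistent constraint $y = f(u) \wedge u = a \wedge z = f(x) \wedge x \sqsubseteq y \wedge x \sqsubseteq z$.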
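
The looping behaviour can be replayed mechanically. The following Python sketch is an illustration only: it encodes just the (Descend) rule of Figure 1 on a hypothetical list-of-tuples representation, where `("sub", x, y)` stands for $x \sqsubseteq y$ and `("eq", x, f, args)` stands for $x = f(\bar{y})$. The names `descend_once` and `fresh` are assumptions made for this example.

```python
import itertools

def descend_once(constraints, fresh):
    """Apply (Descend) to the first x <= y for which some y = f(z...) is
    present; return the new constraint list, or None if no step applies."""
    eqs = {c[1]: (c[2], c[3]) for c in constraints if c[0] == "eq"}
    for i, c in enumerate(constraints):
        if c[0] == "sub" and c[2] in eqs:
            _, x, y = c
            f, zs = eqs[y]
            us = tuple(next(fresh) for _ in zs)      # fresh variables
            new = [("eq", x, f, us)] + [("sub", u, z) for u, z in zip(us, zs)]
            return constraints[:i] + new + constraints[i + 1:]
    return None

# x <= y  and  y = f(x)
phi = [("sub", "x", "y"), ("eq", "y", "f", ("x",))]
fresh = (f"x{i}" for i in itertools.count(1))
for step in range(5):
    phi = descend_once(phi, fresh)
    print(step + 1, phi)
```

Every step consumes one weak subsumption constraint and produces another one on a fresh variable, so the rule stays applicable and the constraint keeps growing.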

4 Algorithm 

To consider trees with free variables as sets of instances means that we need to compute
intersections of such sets and to decide their emptiness. When we simplify $x \sqsubseteq y \wedge x \sqsubseteq z$ in a
context $\varphi$, we have to compute the intersection of the sets of instances of $y$ and $z$. In order
to avoid the introduction of new variables we add a new class of variables to represent such
intersections, and one new constraint. Intersection variables are defined as nonempty finite
subsets of base variables. In order to capture the intended semantics, we write $\chi_1 \cap \ldots \cap \chi_n$
instead of $\{\chi_1\} \cup \ldots \cup \{\chi_n\}$. The equality $=$ on intersection variables is the equality on
powersets, which satisfies:

\[ x \cap y = y \cap x, \qquad (x \cap y) \cap z = x \cap (y \cap z), \qquad x \cap x = x. \]

We call $x$ a component of $y$, if $y = x \cap z$ for some $z$. The set of components of a variable $x$
is denoted by $C(x)$. Note that $x \cap y \in V(\varphi)$ implies $x \in C(V(\varphi))$ but in general not $x \in V(\varphi)$.
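
Since intersection variables are finite sets of base variables, the three laws hold by construction in any set-based encoding. A minimal Python sketch, assuming base variables are strings and intersection variables are `frozenset`s (the helper names `inter` and `is_component` are hypothetical):

```python
def inter(*parts):
    """Build the intersection variable of the given parts. A part is either
    a base variable (str) or an intersection variable (frozenset)."""
    out = frozenset()
    for p in parts:
        out |= frozenset([p]) if isinstance(p, str) else frozenset(p)
    return out

def is_component(x, y):
    """x is a component of y iff y = x ∩ z for some z, i.e. x is a
    nonempty subset of y in the set encoding."""
    return bool(x) and x <= y

x, y, z = inter("x"), inter("y"), inter("z")
print(inter(x, y) == inter(y, x))                      # commutativity
print(inter(inter(x, y), z) == inter(x, inter(y, z)))  # associativity
print(inter(x, x) == x)                                # idempotence
print(is_component(x, inter(x, y)))                    # x is a component of x ∩ y
```

Using immutable sets also makes intersection variables hashable, so they can serve directly as keys when a constraint store is indexed by variable.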
As additional constraint we introduce $x \sqsubseteq f(\bar{y})$, with the semantics:

\[ \models\; x \sqsubseteq f(\bar{y}) \;\leftrightarrow\; \exists u\, (x \sqsubseteq u \wedge u = f(\bar{y})) . \]

Complete semantics has to take care of intersection variables such as $y \cap z$. Constraint
solving will propagate intersection variables into most constraint positions. That is, our
algorithm actually operates on the following constraints:

\[ x, y ::= \chi \;\mid\; x \cap y \qquad \text{and} \qquad \varphi, \psi ::= x = y \;\mid\; x = f(\bar{y}) \;\mid\; x \sqsubseteq y \;\mid\; x \sqsubseteq f(\bar{y}) \;\mid\; \varphi \wedge \psi \]

However, if started with a constraint containing only base variables, our algorithm
maintains this invariant for the equational constraints.

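Let us call a variable $x$ immediately determined by $f$ in $\varphi$, written $x \diamond_\varphi f(\bar{y})$, if one of $x = f(\bar{y})$
or $x \sqsubseteq f(\bar{y})$ is in $\varphi$ for some $f(\bar{y})$. We say that $x$ is immediately determined in $\varphi$ if it
is immediately determined by some $f$ in $\varphi$. Call $x$ determined, written $x \lhd_\varphi f(\bar{u})$, if $x$ is
immediately determined in $\varphi$, or $x \sqsubseteq y \cap z$ and $y \diamond_\varphi f(\bar{u})$ are in $\varphi$. Obviously, if $x \lhd_\varphi f(\bar{y})$, then
the top-level constructor of $x$ must be $f$.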
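
The two notions can be read as simple lookups in the constraint. The following Python sketch is an illustration under stated assumptions, not the authors' implementation: variables are `frozenset`s of base variable names, and a constraint is a list of tuples `("eq", x, f, args)`, `("subf", x, f, args)` for $x \sqsubseteq f(\bar{y})$, and `("sub", x, y)`. All names are hypothetical.

```python
def iv(*names):
    """Intersection variable over the given base variable names."""
    return frozenset(names)

def immediately_determined(phi, x):
    """All (f, args) such that x = f(args) or x <= f(args) is in phi."""
    return {(c[2], c[3]) for c in phi
            if c[0] in ("eq", "subf") and c[1] == x}

def determined(phi, x):
    """Immediately determined, or x <= y ∩ z in phi where the component y
    is immediately determined in phi."""
    found = immediately_determined(phi, x)
    for c in phi:
        if c[0] == "sub" and c[1] == x:
            upper = c[2]
            for d in phi:
                if d[0] in ("eq", "subf") and d[1] <= upper:
                    found.add((d[2], d[3]))
    return found

x, y, z, u = iv("x"), iv("y"), iv("z"), iv("u")
phi = [("sub", x, y | z), ("eq", y, "f", (u,))]   # x <= y ∩ z  and  y = f(u)
print(immediately_determined(phi, x))  # set()
print(determined(phi, x) == {("f", (u,))})  # True
```

If `determined` returns two entries with distinct constructor names for the same variable, the constraint cannot be satisfied, which is the situation the (Clash) rule detects.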

We define the application of an operator $[y/x]$ to intersection variables, only if $x$ is a base
variable. If $z = (x_1 \cap \ldots \cap x_n)$, then we define $z[y/x]$ by:

\[ z[y/x] = x_1[y/x] \cap \ldots \cap x_n[y/x] . \]

We say that $[y/x]$ applied to intersection variables performs deep substitution. The following
property holds for deep substitution:

\[ C(V(x = y \wedge \varphi)) = C(V(x = y \wedge \varphi[y/x])) . \]

Note however that $V(x = y \wedge \varphi) \neq V(x = y \wedge \varphi[y/x])$ if $\varphi = z \sqsubseteq x \cap y$. The variable $x \cap y$
is contained in the first but not in the second set. We can now specify our algorithm for
constraint simplification. It is given by the rules in Figure 2 and Figure 3.
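
Deep substitution is easy to replay in a set encoding. The Python sketch below is illustrative only and assumes intersection variables are `frozenset`s of base variable names; the name `deep_subst` is hypothetical. It reproduces the example $\varphi = z \sqsubseteq x \cap y$, where $x \cap y$ disappears from the variable set after applying $[y/x]$.

```python
def deep_subst(v, y, x):
    """v[y/x]: replace the base variable x by y inside the intersection
    variable v, component by component."""
    return frozenset(y if b == x else b for b in v)

fs = frozenset
# Variables of  x = y ∧ z <= x ∩ y
before = {fs({"x"}), fs({"y"}), fs({"z"}), fs({"x", "y"})}
# Variables of  x = y ∧ (z <= x ∩ y)[y/x]: the equation x = y is kept,
# the substitution is applied to the variables of the remaining constraint.
phi_vars = {fs({"z"}), fs({"x", "y"})}
after = {fs({"x"}), fs({"y"})} | {deep_subst(v, "y", "x") for v in phi_vars}

print(deep_subst(fs({"x", "y"}), "y", "x") == fs({"y"}))  # True: y ∩ y = y
print(fs({"x", "y"}) in before, fs({"x", "y"}) in after)  # True False
```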

The rule (Decom) is known from usual unification for rational trees. Up to the application
condition $x \in C(V(\varphi)) \cap BV$, this also applies to rule (Elim). This side condition accounts
for deep substitution. The (Clash) rule contains as special cases:

\[ \dfrac{x = f(\bar{y}) \wedge x = g(\bar{z}) \wedge \varphi}{\bot} \;\; f \neq g \qquad\qquad \dfrac{x \sqsubseteq f(\bar{y}) \wedge x \sqsubseteq g(\bar{z}) \wedge \varphi}{\bot} \;\; f \neq g \]

Its full power comes in interaction with the rules in Figure 3. Then it allows us to derive
a clash if for a variable $x$ a constructor is known, and for some variable $x \cap y$ a distinct
constructor is derivable.



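(Decom)    $\dfrac{x = f(\bar{u}) \wedge \varphi}{\bar{u} = \bar{v} \wedge \varphi}$    if $x = f(\bar{v})$ in $\varphi$.

(Clash)    $\dfrac{\varphi}{\bot}$    if $x \lhd_\varphi f(\bar{u})$, $x \cap y \lhd_\varphi g(\bar{v})$, and $f \neq g$.

(Elim)     $\dfrac{x = y \wedge \varphi}{x = y \wedge \varphi[y/x]}$    if $x \in C(V(\varphi)) \cap BV$, and $x \neq y$.

Figure 2: Rational Tree Unification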



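(Propagate1)  $\dfrac{x \cap y \sqsubseteq z \wedge \varphi}{x \cap y \sqsubseteq z \cap u \wedge \varphi}$    if $x \sqsubseteq u$ in $\varphi$, $z \cap u \neq z$.

(Propagate2)  $\dfrac{x \cap y \sqsubseteq f(\bar{u}) \wedge \varphi}{x \cap y \sqsubseteq f(\bar{u} \cap \bar{v}) \wedge \varphi}$    if $x \lhd_\varphi f(\bar{v})$, $\bar{u} \cap \bar{v} \neq \bar{u}$.

(Collapse)    $\dfrac{x \sqsubseteq y \cap u \wedge \varphi}{x \sqsubseteq y \cap z \cap u \wedge \varphi}$    if $y \sqsubseteq z$ in $\varphi$, and $y \cap z \cap u \neq y \cap u$.

(Descend1)    $\dfrac{x = f(\bar{u}) \wedge \varphi}{x = f(\bar{u}) \wedge \bar{u} \sqsubseteq \bar{v} \wedge \varphi}$    if $x \lhd_\varphi f(\bar{v})$, $\bar{u} \sqsubseteq \bar{v} \cap \bar{w}$ not in $\varphi$.

(Descend2)    $\dfrac{\varphi}{x \cap y \sqsubseteq f(\bar{u}) \wedge \varphi}$    if $x \cap y \in V(\varphi)$, $x \lhd_\varphi f(\bar{u})$,
              and not $x \cap y \diamond_\varphi g(\bar{v})$ in $\varphi$.

Figure 3: Simplifying Membership Constraints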



Rules (Propagate1) and (Propagate2) propagate intersection variables into the right hand
side of weak subsumption constraints. The (Collapse) rule collapses chains of variables
related via weak subsumption constraints. In other words, these rules propagate lower
bounds with respect to the weak subsumption relation.

The rules (Descend1) and (Descend2) replace (Descend) from the non-terminating
algorithm in Figure 1. The Descend rules are the only rules introducing new weak subsumption
constraints. The rule (Descend2) introduces a constructor for intersection variables $x \cap y$
by adding a constraint of the form $x \cap y \sqsubseteq f(\bar{u})$. If the rule is applied, then the intersection
of $x$ and $y$ is forced to be nonempty. Nonemptiness is implied by $\varphi$, if $x \cap y$ occurs in $\varphi$
($x \cap y \in V(\varphi)$).

Note that (Descend1) and (Descend2) are carefully equipped with side conditions for
termination. For example, the following derivations are not possible:

\[ \dfrac{x = f(\bar{u})}{x = f(\bar{u}) \wedge x \sqsubseteq f(\bar{u})} \qquad \dfrac{x \sqsubseteq y \wedge x = f(x) \wedge x \sqsubseteq f(y)}{x \sqsubseteq y \wedge x \sqsubseteq y \wedge x = f(x) \wedge x \sqsubseteq f(y)} \qquad \dfrac{x = f(y)}{y \sqsubseteq y \wedge x = f(y)} \]

We can prove that our algorithm performs equivalence transformations with respect to
substitutions $\sigma$ which meet the intended semantics of intersection variables, i.e.,
intersection-correct substitutions:

Definition 4.1 (Intersection Correct) We say that a substitution $\sigma$ is
intersection-correct for $x$ and $y$, if it satisfies:

\[ \sigma(x \cap y) = \sigma(x) \cap \sigma(y) . \]

We say that a substitution $\sigma$ is intersection-correct, if the following properties hold for all
intersection variables $x$, $y$ and $z$:

If $x, y, x \cap y \in \mathrm{dom}(\sigma)$, then $\sigma$ is intersection-correct for $x$ and $y$.
If $x, x \cap y \in \mathrm{dom}(\sigma)$, then $\sigma$ is intersection-correct for $x \cap y$ and $y$.

Note that $\sigma$ is intersection-correct for $x$ and $x \cap y$, iff $\sigma(x \cap y) \subseteq \sigma(x)$. We call a constraint $\varphi$
intersection-satisfiable, if $\varphi$ has an intersection-correct solution.

Proposition 4.2 Let $\varphi$ be a constraint containing base variables only. Then $\varphi$ is
satisfiable, if and only if it is intersection-satisfiable.

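We denote the set of all intersection-correct solutions of $\varphi$ with $\mathrm{Sol}^I(\varphi)$. Assume $\sigma$ to be a
substitution. A $V$-extension of $\sigma$ is a substitution $\sigma'$ such that $\mathrm{dom}(\sigma') = \mathrm{dom}(\sigma) \cup V$ and such
that $\sigma'$ and $\sigma$ coincide on $\mathrm{dom}(\sigma)$. We denote the set of all intersection-correct $V$-extensions
of $\sigma$ with $\mathrm{Ext}_V(\sigma)$. Let $\varphi$ and $\psi$ be constraints. We say that $\varphi$ intersection-implies $\psi$,
written $\varphi \models^I \psi$, if

\[ \mathrm{Ext}_{V(\psi)}(\mathrm{Sol}^I(\varphi)) \subseteq \mathrm{Sol}^I(\psi) \quad\text{and}\quad \mathrm{Sol}^I(\varphi) = \emptyset \;\text{ iff }\; \mathrm{Ext}_{V(\psi)}(\mathrm{Sol}^I(\varphi)) = \emptyset \]

We call $\varphi$ and $\psi$ intersection-equivalent if $\varphi \models^I \psi$ and $\psi \models^I \varphi$, and write $\varphi \mathrel{\dashv\vDash}^I \psi$. Both
conditions ensure the following Lemma: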

Lemma 4.3 If $\varphi$ is not intersection-satisfiable, then $\varphi \models^I \psi$ holds vacuously for all $\psi$.
Furthermore, if $\varphi \mathrel{\dashv\vDash}^I \psi$, then $\varphi$ is intersection-satisfiable if and only if $\psi$ is.
Given the above notions, the following two theorems are our main results. For the proofs
the reader is referred to [MN95].

Theorem 4.4 (Termination) The rule system given in Figures 2 and 3 terminates.

Theorem 4.5 (Correctness and Completeness) Let $\varphi$ be a constraint containing base
variables only. Then the following statements are equivalent:

1. $\varphi$ is intersection-satisfiable.

2. There exists an irreducible $\psi \neq \bot$ derivable from $\varphi$.

3. There exists an irreducible $\psi \neq \bot$ that is intersection-equivalent to $\varphi$.

4. $\bot$ cannot be derived from $\varphi$.
5 Outlook

We have presented an algorithm for deciding satisfiability of weak subsumption constraints
over infinite constructor trees with holes. Our motivation to solve such constraints grew
out of a type inference problem. Formally, the problem is equivalent to type checking
a weak form of polymorphic recursion. Type checking polymorphic recursion is
equivalent to semi-unification and to subsumption of feature graphs. All three are undecidable
[Hen88, KTU93, DR90]. We establish a similar correspondence between a type inference
problem and weak subsumption of feature graphs: The latter has been investigated by Dörre
looking for a logical treatment of coordination phenomena in unification based grammar
formalisms [Dör94]. Our starting point from the constraint language Oz, however, led us
to an incremental algorithm, in contrast to the automata based solution of Dörre.